Meeting Pearls 1

home *** CD-ROM | disk | FTP | other *** search

/ Meeting Pearls 1 / Meeting Pearls Vol 1 (1994).iso / installed_progs / text / faqs / ai-faq.genetic.part3 < prev next >

Wrap

Internet Message Format | 1994-03-19 | 38.4 KB

Subject: FAQ: comp.ai.genetic part 3/6 (A Guide to Frequently Asked Questions) Newsgroups: comp.ai.genetic,comp.answers,news.answers From: David.Beasley@cf.cm.ac.uk (David Beasley) Date: Fri, 18 Mar 94 15:19:16 GMT Archive-name: ai-faq/genetic/part3 Last-Modified: 3/20/94 Issue: 2.1 TABLE OF CONTENTS OF PART 3 Q2: What applications of EAs are there? Q3: Who is concerned with EAs? Q4: How many EAs exist? Which? Q4.1: What about Alife systems, like Tierra and VENUS? Q5: What about all this Optimization stuff? ---------------------------------------------------------------------- Subject: Q2: What applications of EAs are there? In principle, EAs can compute any computable function, i.e. everything a normal digital computer can do. But EAs are especially badly suited for problems where efficient ways of solving them are already known, (unless these problems are intended to serve as benchmarks). Special purpose algorithms, i.e. algorithms that have a certain amount of problem domain knowledge hard coded into them, will usually outperform EAs, so there is no black magic in EC. EAs should be used when there is no other known problem solving strategy, and the problem domain is NP-complete. That's where EAs come into play: heuristically finding solutions where all else fails. Following is an incomplete (sic!) list of successful EA applications: TIMETABLING This has been addressed quite successfully with GAs. A very common manifestation of this kind of problem is the timetabling of exams or classes in Universities, etc. At the Department of Artificial Intelligence, University of Edinburgh, timetabling the MSc exams is now done using a GA (Corne et al. 93, Fang 92). An example of the use of GAs for timetabling classes is (Abramson & Abela 1991). In the exam timetabling case, the FITNESS function for a GENOME representing a timetable involves computing degrees of punishment for various problems with the timetable, such as clashes, instances of students having to take consecutive exams, instances of students having (eg) three or more exams in one day, the degree to which heavily-subscribed exams occur late in the timetable (which makes marking harder), overall length of timetable, etc. The modular nature of the fitness function has the key to the main potential strength of using GAs for this sort of thing as opposed to using conventional search and/or constraint programming methods. The power of the GA approach is the ease with which it can handle arbitrary kinds of constraints and objectives; all such things can be handled as weighted components of the fitness function, making it easy to adapt the GA to the particular requirements of a very wide range of possible overall objectives . Very few other timetabling methods, for example, deal with such objectives at all, which shows how difficult it is (without GAs) to graft the capacity to handle arbitrary objectives onto the basic "find shortest- length timetable with no clashes" requirement. The proper way to weight/handle different objectives in the fitness function in relation to the general GA dynamics remains, however, an important research problem! GAs thus offer a combination of simplicity, flexibility & speed which competes very favorably with other approaches, but are unlikely to outperform knowledge-based (etc) methods if the best possible solution is required at any cost. Even then, however, hybridisation may yield the best of both worlds; also, the ease (if the hardware is available!) of implementing GAs in parallel enhances the possibility of using them for good, fast solutions to very hard timetabling and similar problems. References Corne, D. Fang, H.-L. & Mellish, C. (1993) "Solving the Modular Exam Scheduling Problem with Genetic Algorithms". Proc. of 6th Int'l Conf. on Industrial and Engineering Applications of Artificial Intelligence & Expert Systems, ISAI, (to appear). Fang, H.-L. (1992) "Investigating GAs for scheduling", MSc Dissertation, University of Edinburgh Dept. of Artificial Intelligence, Edinburgh, UK. Abramson & Abela (1991) "A Parallel Genetic Algorithm for Solving the School Timetabling Problem", Technical Report, Division of I.T., C.S.I.R.O, April 1991. (Division of Information Technology, C.S.I.R.O., c/o Dept. of Communication & Electronic Engineering, Royal Melbourne Institute of Technology, PO BOX 2476V, Melbourne 3001, Australia) JOB-SHOP SCHEDULING The Job-Shop Scheduling Problem (JSSP) is a very difficult NP- complete problem which, so far, seems best addressed by sophisticated branch and bound search techniques. GA researchers, however, are continuing to make progress on it. (Davis 85) started off GA research on the JSSP, (Whitley 89) reports on using the edge RECOMBINATION operator (designed initially for the TSP) on JSSPs too. More recent work includes (Nakano 91),(Yamada & Nakano 92), (Fang et al. 93). The latter three report increasingly better results on using GAs on fairly large benchmark JSSPs (from Muth & Thompson 63); neither consistently outperform branch & bound search yet, but seem well on the way. A crucial aspect of such work (as with any GA application) is the method used to encode schedules. An important aspect of some of the recent work on this is that better results have been obtained by rejecting the conventional wisdom of using binary representations (as in (Nakano 91)) in favor of more direct encodings. In (Yamada & Nakano 92), for example, a GENOME directly encodes operation completion times, while in (Fang et al. 93) genomes represent implicit instructions for building a schedule. The success of these latter techniques, especially since their applications are very important in industry, should eventually spawn advances in GA theory. Concerning the point of using GAs at all on hard job-shop scheduling problems, the same goes here as suggested above for `Timetabling': The GA approach enables relatively arbitrary constraints and objectives to be incorporated painlessly into a single OPTIMIZATION method. It is unlikely that GAs will outperform specialized knowledge-based and/or conventional OR-based approaches to such problems in terms of raw solution quality, however GAs offer much greater simplicity and flexibility, and so, for example, may be the best method for quick high-quality solutions, rather than finding the best possible solution at any cost. Also, of course, hybrid methods will have a lot to offer, and GAs are far easier to parallelize than typical knowledge-based/OR methods. Similar to the JSSP is the Open Shop Scheduling Problem (OSSP). (Fang et al. 93) reports an initial attempt at using GAs for this. Ongoing results from the same source shows reliable achievement of results within less than 0.23% of optimal on moderately large OSSPs (so far, up to 20x20), including an improvement on the previously best known solution for a benchmark 10x10 OSSP. A simpler form of job shop problem is the Flow-Shop Sequencing problem; recent successful work on applying GAs to this includes (Reeves 93)." References Davis, L. (1985) "Job-Shop Scheduling with Genetic Algorithms", [ICGA85], 136-140. Muth, J.F. & Thompson, G.L. (1963) "Industrial Scheduling". Prentice Hall, Englewood Cliffs, NJ, 1963. Nakano, R. (1991) "Conventional Genetic Algorithms for Job-Shop Problems", [ICGA91], 474-479. Reeves, C.R. (1993) "A Genetic Algorithm for Flowshop Sequencing", Coventry Polytechnic Working Paper, Coventry, UK. Whitley, D., Starkweather, T. & D'Ann Fuquay (1989) "Scheduling Problems and Traveling Salesmen: The Genetic Edge Recombination Operator", [ICGA89], 133-140. Fang, H.-L., Ross, P., & Corne D. (1993) "A Promising Genetic Algorithm Approach to Job-Shop Scheduling, Rescheduling & Open-Shop Scheduling Problems", [ICGA93], 375-382. Yamada, T. & Nakano, R. (1992) "A Genetic Algorithm Applicable to Large-Scale Job-Shop Problems", [PPSN92], 281-290. MANAGEMENT SCIENCES "Applications of EA in management science and closely related fields like organizational ecology is a domain that has been covered by some EA researchers - with considerable bias towards scheduling problems. Since I believe that EA have considerable potential for applications outside the rather narrow domain of scheduling and related combinatorial problems, I started collecting references about the status quo of EA-applications in management science. This report intends to make available my findings to other researchers in the field. It is a short overview and lists some 230 references to current as well as finished research projects. [..] "At the end of the paper, a questionnaire has been incorporated that may be used for this purpose. Other comments are also appreciated." --- from the Introduction of (Nissen 93) References Nissen, V. (1993) "Evolutionary Algorithms in Management Science: An Overview and List of References", Papers on Economics and Evolution, edited by the European Study Group for Evolutionary Economics. This report is also avail. via anon. FTP from gwdu03.gwdg.de:/pub/msdos/reports/wi/earef.eps Boulding, K.E. (1991) "What is evolutionary economics?", Journal of Evolutionary Economics, 1, 9-17. GAME PLAYING GAs can be used to evolve behaviors for playing games. Work in evolutionary GAME THEORY typically surrounds the EVOLUTION of a POPULATION of players who meet randomly to play a game in which they each must adopt one of a limited number of moves. (Maynard-Smith 1982). Let's suppose it is just two moves, X and Y. The players receive a reward, analogous to Darwinian FITNESS, depending on which combination of moves occurs and which move they adopted. In more complicated models there may be several players and several moves. The players iterate such a game a series of times, and then move on to a new partner. At the end of all such moves, the players will have a cumulative payoff, their FITNESS. This fitness can then be used as a means of conducting something akin to Roulette-Wheel SELECTION to generate a new POPULATION. The real key in using a GA is to come up with an encoding to represent player's strategies, one that is amenable to CROSSOVER and to MUTATION. possibilities are to suppose at each iteration a player adopts X with some probability (and Y with one minus such). A player can thus be represented as a real number, or a bit-string by interpreting the decimal value of the bit string as the inverse of the probability. An alternative characterisation is to model the players as Finite State Machines, or Finite Automata (FA). These can be though of as a simple flow chart governing behaviour in the "next" play of the game depending upon previous plays. For example: 100 Play X 110 If opponent plays X go to 100 120 Play Y 130 If opponent plays X go to 100 else go to 120 Represents a strategy that does whatever its opponent did last, and begins by playing X, known as "Tit-For-Tat." (Axelrod 1982). Such machines can readily be encoded as bit-strings. Consider the encoding "1 0 1 0 0 1" to represent TFT. The first three bits, "1 0 1" are state 0. The first bit, "1" is interpreted as "Play X." The second bit, "0" is interpreted as "if opponent plays X go to state 1," the third bit, "1", is interpreted as "if the opponent plays Y, go to state 1." State 1 has a similar interpretation. Crossing over such bit-strings always yields valid strategies. SIMULATIONs in the Prisoner's dilemma have been undertaken (Axelrod 1987, Fogel 1993, Miller 1989) of these machines. Alternative representations of game players include CLASSIFIER SYSTEMs (Marimon, McGrattan and Sargent 1990, [GOLD89]), and Neural- networks (Fogel and Harrald 1994), though not necessarily with a GA. (Fogel 1993), and Fogel and Harrald 1994 use an Evolutionary Program). Other methods of evolving a POPULATION can be found in Lindgren 1991, Glance and Huberman 1993 and elsewhere. References. Axelrod, R. (1987) ``The EVOLUTION of Strategies in the Repeated Prisoner's Dilemma,'' in [DAVIS91] Miller, J.H. (1989) ``The Coevolution of Automata in the Repeated Prisoner's Dilemma'' Santa Fe Institute Working Paper 89-003. Marimon, Ramon, Ellen McGrattan and Thomas J. Sargent (1990) ``Money as a Medium of Exchange in an Economy with Artificially Intelligent Agents'' Journal of Economic Dynamics and Control 14, pp. 329--373. Maynard-Smith, (1982) EVOLUTION and the Theory of Games, CUP. Lindgren (1991) ``Evolutionary Phenomena in Simple Dynamics,'' in [ALIFEI]. Holland, J.H and John Miller (1990) ``Artificially Adaptive Agents in Economic Theory,'' American Economic Review: Papers and Proceedings of the 103rd Annual Meeting of the American Economics Association: 365--370. Huberman, Bernado, and Natalie S. Glance (1993) "Diversity and Collective Action" in H. Haken and A. Mikhailov (eds.) Interdisciplinary Approaches to Nonlinear Systems, Springer. Fogel (1993) "Evolving Behavior in the Iterated Prisoner's Dilemma" EVOLUTIONARY COMPUTATION 1:1 Fogel, D.B. and Paul Harrald (1994) ``Evolving Complex Behaviour in the Iterated Prisoner's Dilemma,'' Proceedings of the Fourth Annual Meetings of the EVOLUTIONARY PROGRAMMING Society, L.J. Fogel and A.W. Sebald eds., World Science Press. ------------------------------ Subject: Q3: Who is concerned with EAs? EVOLUTIONARY COMPUTATION attracts researchers and people of quite dissimilar disciplines, i.e. EC is a interdisciplinary research field: Computer scientists Want to find out about the properties of sub-symbolic information processing with EAs and about learning, i.e. adaptive systems in general. They also build the hardware necessary to enable future EAs (precursors are already beginning to emerge) to huge real world problems, i.e. the term "massively parallel computation" [HILLIS92], springs to mind. Engineers Of many kinds want to exploit the capabilities of EAs on many areas to solve their application, esp. OPTIMIZATION problems. Roboticists Want to build MOBOTs (MOBile ROBOTs, i.e. R2D2's and #5's cousins) that navigate through uncertain ENVIRONMENTs, without using built-in "maps". The MOBOTS thus have to adapt to their surroundings, and learn what they can do "move-through-door" and what they can't "move- through-wall" on their own by "trial-and-error". Cognitive scientists Might view CFS as a possible apparatus to describe models of thinking and cognitive systems. Physicists Use EC hardware, e.g. Hillis' (Thinking Machine Corp.'s) Connection Machine to model real world problems which include thousands of variables, that run "naturally" in parallel, and thus can be modelled more easily and esp. "faster" on a parallel machine, than on a serial "PC" one. Biologists In fact many working biologists are hostile to modeling, but an entire community of Population Biologists arose with the 'evolutionary synthesis' of the 1930's created by the polymaths R.A. Fisher, J.B.S. Haldane, and S. Wright. Wright's SELECTION in small POPULATIONs, thereby avoiding local optima) is of current interest to both biologists and ECers -- populations are naturally parallel. A good exposition of current POPULATION Biology modeling is J. Maynard Smith's text Evolutionary Genetics. Richard Dawkin's Selfish Gene and Extended Phenotype are unparalleled (sic!) prose expositions of evolutionary processes. Rob Collins' papers are excellent parallel GA models of evolutionary processes (available in [ICGA91] and by FTP from ftp.cognet.ucla.edu:/pub/alife/papers/ ). As fundamental motivation, consider Fisher's comment: "No practical biologist interested in (e.g.) sexual REPRODUCTION would be led to work out the detailed consequences experienced by organisms having three or more sexes; yet what else should [s/]he do if [s/]he wishes to understand why the sexes are, in fact, always two?" (Three sexes would make for even weirder grammar, [s/]he said...) Philosophers and some other really curious people may also be interested in EC for various reasons. ------------------------------ Subject: Q4: How many EAs exist? Which? The All Stars There are currently 3 main paradigms in EA research: GENETIC ALGORITHMs, EVOLUTIONARY PROGRAMMING, and EVOLUTION STRATEGIEs. CLASSIFIER SYSTEMs and GENETIC PROGRAMMING are OFFSPRING of the GA community. Besides this leading crop, there are numerous other different approaches, alongside hybrid experiments, i.e. there exist pieces of software residing in some researchers computers, that have been described in papers in conference proceedings, and may someday prove useful on certain tasks. To stay in EA slang, we should think of these evolving strands as BUILDING BLOCKs, that when recombined someday, will produce new offspring and give birth to new EA paradigm(s). Promising Rookies As far as "solving complex function and COMBINATORIAL OPTIMIZATION tasks" is concerned, Davis' work on real-valued representations and adaptive operators should be mentioned (Davis 89). Moreover Whitley's Genitor system incorporating ranking and "steady state" mechanism (Whitley 89), Goldberg's "messy GAs", involves adaptive representations (Goldberg 91), and Eshelman's CHC algorithm (Eshelman 91). For "the design of robust learning systems", i.e. the field characterized by CFS, Holland's (1986) CLASSIFIER SYSTEM, with it's state-of-the-art implementation CFS-C (Riolo 88), we should note recent developments in SAMUEL (Grefenstette 89), GABIL (De Jong & Spears 91), and GIL (Janikow 91). References Davis, L. (1989) "Adapting operator probabilities in genetic algorithms", [ICGA89], 60-69. Whitley, D. et al. (1989) "The GENITOR algorithm and SELECTION pressure: why rank-based allocation of reproductive trials is best", [ICGA89], 116-121. Goldberg, D. et al. (1991) "Don't worry, be messy", [ICGA91], 24-30. Eshelman, L.J. et al. (1991) "Preventing premature convergence in GENETIC ALGORITHMs by preventing incest", [ICGA91], 115-122. Holland, J.H. (1986) "Escaping brittleness: The possibilities of general-purpose learning algorithms applied to parallel rule-based systems". In R. Michalski, J. Carbonell, T. Mitchell (eds), Machine Learning: An ARTIFICIAL INTELLIGENCE Approach. Los Altos: Morgan Kaufmann. Riolo, R.L. (1988) "CFS-C: A package of domain independent subroutines for implementing CLASSIFIER SYSTEMs in arbitrary, user- defined environments". Logic of computers group, Division of computer science and engineering, University of Michigan. Grefenstette, J.J. (1989) "A system for learning control strategies with genetic algorithms", [ICGA89], 183-190. De Jong K.A. & Spears W. (1991) "Learning concept classification rules using genetic algorithms". Proc. 12th IJCAI, 651-656, Sydney, Australia: Morgan Kaufmann. Janikow C. (1991) "Inductive learning of decision rules from attribute-based examples: A knowledge-intensive GENETIC ALGORITHM approach". TR91-030, The University of North Carolina at Chapel Hill, Dept. of Computer Science, Chapel Hill, NC. ------------------------------ Subject: Q4.1: What about Alife systems, like Tierra and VENUS? None of these are Evolutionary Algorithms, but all of them use the evolutionary metaphor as their "playing field". Tierra Synthetic organisms have been created based on a computer metaphor of organic life in which CPU time is the ``energy'' resource and memory is the ``material'' resource. Memory is organized into informational patterns that exploit CPU time for self-replication. MUTATION generates new forms, and EVOLUTION proceeds by natural SELECTION as different GENOTYPEs compete for CPU time and memory space. Observation of nature shows that EVOLUTION by natural SELECTION is capable of both OPTIMIZATION and creativity. Artificial models of evolution have demonstrated the optimizing ability of evolution, as exemplified by the field of GENETIC ALGORITHMs. The creative aspects of evolution have been more elusive to model. The difficulty derives in part from a tendency of models to specify the meaning of the ``genome'' of the evolving entities, precluding new meanings from emerging. I will present a natural model of evolution demonstrating both optimization and creativity, in which the GENOME consists of sequences of executable machine code. From a single rudimentary ancestral ``creature'', very quickly there evolve parasites, which are not able to replicate in isolation because they lack a large portion of the GENOME. However, these parasites search for the missing information, and if they locate it in a nearby creature, parasitize the information from the neighboring genome, thereby effecting their own replication. In some runs, hosts evolve immunity to attack by parasites. When immune hosts appear, they often increase in frequency, devastating the parasite POPULATIONs. In some runs where the community comes to be dominated by immune hosts, parasites evolve that are resistant to immunity. Hosts sometimes evolve a response to parasites that goes beyond immunity, to actual (facultative) hyper-parasitism. The hyper- parasite deceives the parasite causing the parasite to devote its energetic resources to replication of the hyper-parastie GENOME. This drives the parasites to extinction. Evolving in the absence of parasites, hyper-parasites completely dominate the community, resulting in a relatively uniform community characterized by a high degree of relationship between INDIVIDUALs. Under these circumstances, sociality evolves, in the form of creatures which can only replicate in aggregations. The cooperative behavior of the social hyper-parasites makes them vulnerable to a new class of parasites. These cheaters, hyper-hyper- parasites, insert themselves between cooperating social INDIVIDUALs, deceiving the social creatures, causing them to replicate the GENOMEs of the cheaters. The only genetic change imposed on the simulator is random bit flips in the machine code of the creatures. However, it turns out that parasites are very sloppy replicators. They cause significant RECOMBINATION and rearrangement of the GENOMEs. This spontaneous sexuality is a powerful force for evolutionary change in the system. One of the most interesting aspects of this instance of life is that the bulk of the EVOLUTION is based on adaptation to the biotic ENVIRONMENT rather than the physical environment. It is co-evolution that drives the system. --- "Tierra announcement" by Tom Ray (1991) How to get Tierra? The complete source code and documentation (but not executables) is available by anonymous FTP at: tierra.slhs.udel.edu:/ (128.175.41.34) and life.slhs.udel.edu:/ (128.175.41.33) in the directories: almond/, beagle/, doc/, and tierra/. If you do not have FTP access you may obtain everything on DOS disks. For details, write to: Virtual Life, 25631 Jorgensen Rd., Newman, CA 95360. References Ray, T. S. (1991) "Is it alive, or is it GA?" in [ICGA91], 527--534. Ray, T. S. (1991) "An approach to the synthesis of life." in [ALIFEII], 371--408. Ray, T. S. (1991) "Population dynamics of digital organisms." in [ALIFEII-V]. Ray, T. S. (1991) "Evolution and OPTIMIZATION of digital organisms." Scientific Excellence in Supercomputing: The IBM 1990 Contest Prize Papers, Eds. Keith R. Billingsley, Ed Derohanes, Hilton Brown, III. Athens, GA, 30602, The Baldwin Press, The University of Georgia. Ray, T. S. (1992) "Evolution, ecology and OPTIMIZATION of digital organisms." Santa Fe Institute working paper 92-08-042. Ray, T. S. "Evolution, complexity, entropy, and artificial reality." submitted Physica D. Avail. as tierra.slhs.udel.edu:/doc/PhysicaD.tex Ray, T. S. (1993) "An evolutionary approach to synthetic biology, Zen and the art of creating life. ARTIFICIAL LIFE 1(1): (in press). Avail. as tierra.slhs.udel.edu:/doc/Zen.tex VENUS Steen Rasmussen's (et al.) VENUS I+II "coreworlds" as described in [ALIFEII] and [LEVY92], are inspired by A.K. Dewdney's well-known article (Dewdney 1984). Dewdney proposed a game called "Core Wars", in which hackers create computer programs that battle for control of a computer's "core" memory (Strack 93). Since computer programs are just patterns of information, a successful program in core wars is one that replicates its pattern within the memory, so that eventually most of the memory contains its pattern rather than that of the competing program. VENUS is a modification of Core Wars in which the Computer programs can mutate, thus the pseudo assembler code creatures of VENUS evolve steadily. Furthermore each memory location is endowed with "resources" which, like sunshine are added at a steady state. A program must have sufficient resources in the regions of memory it occupies in order to execute. The input of resources determines whether the VENUS ecosystem is a "jungle" or a "desert." In jungle ENVIRONMENTs, Rasmussen et al. observe the spontaneous emergence of primitive "copy/split" organisms starting from (structured) random initial conditions. --- [ALIFEII], p.821 Dewdney, A.K. (1984) "Computer Recreations: In the Game called Core War Hostile Programs Engage in a Battle of Bits", Sci. Amer. 250(5), 14-22. Farmer & Belin (1992) "Artificial Life: The Coming Evolution", [ALIFEII], 815-840. Rasmussen, et al. (1990) "The Coreworld: Emergence and EVOLUTION of Cooperative Structures in a Computational Chemistry", [FORREST90], 111-134. Rasmussen, et al. (1992) "Dynamics of Programmable Matter", [ALIFEII], 211-254. Strack (1993) "Core War Frequently Asked Questions (rec.games.corewar FAQ)" Avail. by anon. FTP from rtfm.mit.edu:/pub/usenet/news.answers/games/corewar-faq.Z PolyWorld Larry Yaeger's PolyWorld as described in [ALIFEIII] and [LEVY92] is available via anonymous FTP from ftp.apple.com:/pub/polyworld/ "The subdirectories in this "polyworld" area contain the source code for the PolyWorld ecological simulator, designed and written by Larry Yaeger, and Copyright 1990, 1991, 1992 by Apple Computer. PostScript versions of my ARTIFICIAL LIFE III technical paper have now been added to the directory. These should be directly printable from most machines. Because some unix systems' "lpr" commands cannot handle very large files (ours at least), I have split the paper into Yaeger.ALife3.1.ps and Yaeger.ALife3.2.ps. These files can be ftp-ed in "ascii" mode. For unix users I have also included compressed versions of both these files (indicated by the .Z suffix), but have left the uncompressed versions around for people connecting from non- unix systems. I have not generated PostScript versions of the images, because they are color and the resulting files are much too large to store, retrieve, or print. Accordingly, though I have removed a Word-formatted version of the textual body of the paper that used to be here, I have left a Word-formatted version of the color images. If you wish to acquire it, you will need to use the binary transfer mode to move it to first your unix host and then to a Macintosh (unless Word on a PC can read it - I don't know), and you may need to do something nasty like use ResEdit to set the file type and creator to match those of a standard Word document (Type = WDBN, Creator = MSWD). [..]" --- from the README by Larry Yaeger <larryy@apple.com> General Alife repositories? Also, all of the following FTP sites carry ALIFE related info: ftp.cognet.ucla.edu:/pub/alife/ life.anu.edu.au:/pub/complex_systems/alife/ ftp.cogs.susx.ac.uk:/pub/reports/csrp/ xyz.lanl.gov:/nlin-sys/ ------------------------------ Subject: Q5: What about all this Optimization stuff? Just think of an OPTIMIZATION problem as a black box. A large black box. As large as, for example, a Coca-Cola vending machine. Now, we don't know nothing about the inner workings of this box, but see, that there are some regulators to play with, and of course we know, that we want to have a bottle of the real thing... Putting this everyday problem into a mathematical model, we proceed as follows: (1) we label all the regulators with x and a number starting from 1; the result is a vector x, i.e. (x_1,...,x_n), where n is the number of visible regulators. (2) we must find an objective function, in this case it's obvious, we want to get k bottles of the real thing, where k is equal to 1. [some might want a "greater or equal" here, but we restricted ourselves to the visible regulators (we all know that sometimes a "kick in the right place" gets use more than 1, but I have no idea how to put this mathematically...)] (3) thus, in the language some mathematicians prefer to speak in: f(x) = k = 1. So, what we have here is a maximization problem presented in a form we know from some boring calculus lessons, and we also know that there at least a dozen utterly uninteresting techniques to solve problems presented this way... What can we do in order to solve this problem? We can either try to gain more knowledge or exploit what we already know about the interior of the black box. If the objective function turns out to be smooth and differentiable, analytical methods will produce the exact solution. If this turns out to be impossible, we might resort to the brute force method of enumerating the entire SEARCH SPACE. But with the number of possibilities growing exponentially in n, the number of dimensions (inputs), this method becomes infeasible even for low- dimensional spaces. Consequently, mathematicians have developed theories for certain kinds of problems leading to specialized OPTIMIZATION procedures. These algorithms perform well if the black box fulfils their respective prerequisites. For example, Dantzig's simplex algorithm (Dantzig 66) probably represents the best known multidimensional method capable of efficiently finding the global optimum of a linear, hence convex, objective function in a SEARCH SPACE limited by linear constraints. (A USENET FAQ on linear programming is maintained by John W. Gregory of Cray Research, Inc. Try to get your hands on "linear-programming-faq" (and "nonlinear-programming-faq") that is posted monthly to sci.op-research and is mostly interesting to read.) Gradient strategies are no longer tied to these linear worlds, but they smooth their world by exploiting the objective function's first partial derivatives one has to supply in advance. Therefore, these algorithms rely on a locally linear internal model of the black box. Newton strategies additionally require the second partial derivatives, thus building a quadratic internal model. Quasi-Newton, conjugate gradient and variable metric strategies approximate this information during the search. The deterministic strategies mentioned so far cannot cope with deteriorations, so the search will stop if anticipated improvements no longer occur. In a multimodal ENVIRONMENT these algorithms move "uphill" from their respective starting points. Hence, they can only converge to the next local optimum. Newton-Raphson-methods might even diverge if a discrepancy between their internal assumptions and reality occurs. But of course, these methods turn out to be superior if a given task matches their requirements. Not relying on derivatives, polyeder strategy, pattern search and rotating coordinate search should also be mentioned here because they represent robust non-linear OPTIMIZATION algorithms (Schwefel 81). Dealing with technical OPTIMIZATION problems, one will rarely be able to write down the objective function in a closed form. We often need a SIMULATION model in order to grasp reality. In general, one cannot even expect these models to behave smoothly. Consequently, derivatives do not exist. That is why optimization algorithms that can successfully deal with black box-type situations habe been developed. The increasing applicability is of course paid for by a loss of "convergence velocity," compared to algorithms specially designed for the given problem. Furthermore, the guarantee to find the global optimum no longer exists! But why turn to nature when looking for more powerful algorithms? In the attempt to create tools for various purposes, mankind has copied, more often instinctively than geniously, solutions invented by nature. Nowadays, one can prove in some cases that certain forms or structures are not only well adapted to their ENVIRONMENT but have even reached the optimum (Rosen 67). This is due to the fact that the laws of nature have remained stable during the last 3.5 billion years. For instance, at branching points the measured ratio of the diameters in a system of blood-vessels comes close to the theoretical optimum provided by the laws of fluid dynamics (2^-1/3). This, of course, only represents a limited, engineering point of view on nature. In general, nature performs adaptation, not optimization. The idea to imitate basic principles of natural processes for optimum seeking procedures emerged more than three decades ago (cf Q10, 3. Classics). Although these algorithms have proven to be robust and direct OPTIMIZATION tools, it is only in the last five years that they have caught the researchers' attention. This is due to the fact that many people still look at organic EVOLUTION as a giantsized game of dice, thus ignoring the fact that this model of evolution cannot have worked: a human germ-cell comprises approximately 50,000 GENEs, each of which consists of about 300 triplets of nucleic bases. Although the four existing bases only encode 20 different amino acids, 20^15,000,000, ie circa 10^19,500,000 different GENOTYPEs had to be tested in only circa 10^17 seconds, the age of our planet. So, simply rolling the dice could not have produced the diversity of today's complex living systems. Accordingly, taking random samples from the high-dimensional parameter space of an objective function in order to hit the global optimum must fail (Monte-Carlo search). But by looking at organic EVOLUTION as a cumulative, highly parallel sieving process, the results of which pass on slightly modified into the next sieve, the amazing diversity and efficiency on earth no longer appears miraculous. When building a model, the point is to isolate the main mechanisms which have led to today's world and which have been subjected to evolution themselves. Inevitably, nature has come up with a mechanism allowing INDIVIDUALs of one SPECIES to exchange parts of their genetic information (RECOMBINATION or CROSSOVER), thus being able to meet changing environmental conditions in a better way. Dantzig, G.B. (1966) "Lineare Programmierung und Erweiterungen", Berlin: Springer. (Linear pogramming and extensions) Kursawe, F. (1994) " Evolution strategies: Simple models of natural processes?", Revue Internationale de Systemique, France (to appear). Rosen, R. (1967) "Optimality Principles in Biologie", London: Butterworth. Schwefel, H.-P. (1981) "Numerical OPTIMIZATION of Computer Models", Chichester: Wiley. ------------------------------ End of ai-faq/genetic/part3 ***************************